Neuroadaptive system for modelling human cognition under a Bayesian framework during online EEG simulation

Mathilde Marie Duville

Phase Amplitude Coupling - Theta/Gamma (brain-brain coupling)

Cortical oscillations reflect the high dimensionality of speech processing, from sensory decoding to the access to higher-order internal inferences. Importantly, they substantiate neural activity both at rest and during sensory and cognitive processing. While listening to speech, delta, theta, and gamma oscillators synchronize with the pseudo-rhythmicity of linguistic units.
Auditory speech encompasses temporal fluctuations within 1-40 Hz that are closely linked to prosodic, syllabic, and phonological attributes essential for comprehension. First, intonation patterns follow slow delta rhythms (~1-3 Hz) that track boundary movements based on spontaneous F0 fluctuations and pauses. Second, the syllabic rate oscillates within the theta rhythm (~4-8 Hz), constrained by the motor properties of articulators. Finally, the fine structure of speech, which encodes short-term phonemic cues, appears at high modulation frequencies, typically within high-beta and low-gamma rhythms (~25-40 Hz).
During perception, oscillations may reflect several cognitive aspects of language processing, from sensorial tracking to endogenous mechanisms responsible for higher-level abstract understandings. In this respect, the Tempo model highlights a network of nested rhythms, including a theta oscillator, able to track the syllabic rate of the input, and high-beta/low-gamma oscillations whose fluctuations cascade at frequencies that are multiples of the theta rhythm. Those inter-oscillator dependencies control the decoding of speech information for a more abstract encoding of stimuli.
Theta oscillations may regulate gamma activity through phase-amplitude coupling (PAC). Notably, the theta phase rhythm aligns with the syllabic contour of speech, and for every positive (or negative) phase, gamma amplitude increases (or decreases), hence embedding phonemic representations within syllabic analyses.
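In this project, theta phase and gamma amplitude are computed beforehand within a Simulink model (see below). As a hypothetical illustration only, the two quantities entering a PAC analysis could be extracted from a raw single-channel EEG trace with a band-pass filter followed by the Hilbert transform (Python sketch; the band edges follow the rhythms listed above, and the function name is an assumption):

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def theta_phase_gamma_amplitude(eeg, fs):
    """Extract theta phase (4-8 Hz) and gamma amplitude (25-40 Hz)
    from a single-channel EEG trace via the analytic signal."""
    def bandpass(x, lo, hi):
        # 4th-order Butterworth band-pass, applied zero-phase (filtfilt)
        b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype='band')
        return filtfilt(b, a, x)
    theta = bandpass(eeg, 4, 8)
    gamma = bandpass(eeg, 25, 40)
    theta_phase = np.angle(hilbert(theta))    # instantaneous phase, in radians
    gamma_amplitude = np.abs(hilbert(gamma))  # instantaneous envelope
    return theta_phase, gamma_amplitude
```

The phase series lies within [-pi, pi] and the envelope is non-negative; these are the two inputs to the mutual-information analysis described next.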
Gamma and theta activities are well documented to be organized within Pyramidal Interneuron Gamma (PING) and Theta (PINT) networks. Both populations of pyramidal neurons from PING and PINT are innervated by subcortical auditory channels that reflect the nonlinear filtering of acoustic input by the cochlea. The pyramidal neurons of PING receive specific projections from a single cochlear spectral decomposition, whereas excitatory neurons from PINT receive wideband, non-specific projections from all auditory channels. Oscillations at both frequencies are generated by the two-way interaction between excitatory pyramidal neurons and inhibitory interneurons. Both networks interact via an excitatory connection between theta and gamma pyramidal neurons. Firing patterns and synchronicity between networks are sparse and weak at rest but become stronger during speech processing. The excitation of PING by PINT sets periods of excitability driven by the release of theta bursts between syllables, which time-locks phonemic representations to syllable onsets. The network outputs neuronal firing patterns that operate together to optimize the integration of linguistic representations.

Mutual information to quantify the amount of shared information between two time series

The Mutual Information (MI) is chosen because (a) both linear and non-linear dependencies are captured, and (b) all types of synchronization can be assessed: phase-phase, phase-frequency, phase-amplitude, amplitude-frequency, amplitude-amplitude, and frequency-frequency, which makes it possible to extend this simulation whenever necessary. Because MI is independent of the nature of the dependency between time series, it is applicable to signals approximating sine waves as well as to phase angles, whose distributions are non-uniform. It can also assess dependencies between two time series with different distributions, as is the case between the amplitude on the one hand and the phase on the other hand.
To compute MI:
1. Compute the Shannon entropy of each time series (expressed in bits). It relies on binning the time series to create a histogram, so that it depends only on probabilities and is independent of temporal modulations, amplitude modulations, and scaling:

H(X) = -Σ_{i=1..N} p(x_i) · log2(p(x_i))

where p(x_i) is the probability of observing the ith value of the binned series X, and N is the number of bins. p(x_i) is calculated by dividing the bin count by the sum of all bin counts.
The Freedman-Diaconis rule defines the optimized number of bins:

N = ⌈ (max(X) - min(X)) / (2 · Q · n^(-1/3)) ⌉

where Q is the interquartile range of the distribution of time series X, n is the total number of samples, and ⌈ ⌉ indicates that the ceiling of the result must be taken.
2. Calculate the joint entropy:

H(X,Y) = -Σ_{i=1..N} Σ_{j=1..M} p(x_i, y_j) · log2(p(x_i, y_j))

where p(x_i, y_j) is the probability of observing both the ith and jth values of the binned series X and Y, N is the number of bins used for X, and M is the number of bins used for Y.
3. Compute the MI:

MI(X,Y) = H(X) + H(Y) - H(X,Y)

4. Compute the inflation factor (a low number of samples may lead to the overestimation of entropy) and subtract it from the original MI:

ε = (B - 1)² / (2 · n · ln(2))

where B is the number of bins used to compute MI, n is the number of samples of the time series, and ln(2) is the natural logarithm of 2.
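The steps above can be sketched as follows (Python illustration; the simulation itself uses a MATLAB `MI` function, and a single bin count, averaged over the two Freedman-Diaconis estimates, is assumed here so that the inflation term matches the formula above):

```python
import numpy as np

def fd_bins(x):
    """Freedman-Diaconis rule: ceil(range / (2 * IQR * n^(-1/3)))."""
    q75, q25 = np.percentile(x, [75, 25])
    width = 2 * (q75 - q25) * len(x) ** (-1.0 / 3.0)
    return int(np.ceil((np.max(x) - np.min(x)) / width))

def entropy_bits(x, bins):
    """Shannon entropy of the binned series, in bits."""
    counts, _ = np.histogram(x, bins=bins)
    p = counts / counts.sum()   # bin count divided by the sum of all bin counts
    p = p[p > 0]                # 0 * log2(0) is taken as 0
    return -np.sum(p * np.log2(p))

def mutual_information(x, y):
    """MI = H(X) + H(Y) - H(X, Y), corrected for small-sample inflation."""
    bins = int(np.ceil((fd_bins(x) + fd_bins(y)) / 2))  # one bin count for both series
    h_x = entropy_bits(x, bins)
    h_y = entropy_bits(y, bins)
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p = joint.ravel() / joint.sum()
    p = p[p > 0]
    h_xy = -np.sum(p * np.log2(p))
    mi = h_x + h_y - h_xy
    inflation = (bins - 1) ** 2 / (2 * len(x) * np.log(2))
    return mi - inflation
```

For identical series the corrected MI approaches H(X), whereas for independent series it stays near zero once the inflation term is subtracted.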

Bayesian framework

Concept

Action and perception interact to minimize the expected surprise (i.e., the prediction error, or the negative log-likelihood of sensory input).
At the neural level, predictive coding posits the constant generation of top-down inferences by higher brain regions to build hidden-state hypotheses about the causes of sensory experiences, and to signal the differences between sensory inputs and expectations (i.e., prediction errors).
In that respect, actions sample sensory inputs by controlling precision (i.e., the confidence granted to information) to confirm predictions and maximize the likelihood. Importantly, precision itself must first be estimated towards the optimization of given goals. That is, the weighting of new evidence against prior beliefs must be dynamically adjusted according to context. For instance, if you are listening to speech and trying to infer emotional states from prosodic contours, you may sample the most salient sensory information (e.g., loudness and speech rate). Thus, you would be increasing the informativeness (i.e., precision) of those acoustic features relative to other, irrelevant cues (e.g., the frequency of harmonics), following your prior knowledge about emotion identification. You would also be decreasing the precision assigned to sensory variability relative to prior inferences. Therefore, perception is a constructive process by which information is weighted to optimize behavior relative to the degree of precision (or uncertainty) of both prior knowledge and sensory evidence.
Particularly, when the prediction error is higher than the expected variability of inputs (e.g., a cue-outcome pairing is unexpected given the confidence previously ascribed to contextual volatility and variability), priors are updated to optimize internal modelling and solve future discordances. That is, whenever the sampled sensory evidence does not match the top-down inferences (i.e., the cue is no longer predictive of the emotional prosody), you would re-evaluate the weighting between bottom-up and top-down information. Therefore, high sensory precision would increase the influence of bottom-up affordances by promoting the learning rate of contextual variability (i.e., the receptiveness to sensory variability and volatility), thus lessening the surprise to all inputs and broadening the uncertainty of internal states. Conversely, higher precision assigned to priors would decrease the expected variability of inputs, promote the learning of probabilistic distributions of variants, and bias perception towards internal representations.
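As a toy illustration of this precision weighting (a hypothetical Gaussian sketch, not part of the simulation): when prior belief and sensory evidence are both Gaussian, the posterior mean is their precision-weighted average, so high sensory precision pulls perception towards the input, whereas high prior precision biases it towards the internal model:

```python
def precision_weighted_update(mu_prior, pi_prior, x_sensory, pi_sensory):
    """Combine a Gaussian prior (mean, precision) with Gaussian sensory
    evidence (value, precision); return the posterior mean and precision."""
    pi_post = pi_prior + pi_sensory
    mu_post = (pi_prior * mu_prior + pi_sensory * x_sensory) / pi_post
    return mu_post, pi_post

# High sensory precision: the update is dominated by the input (bottom-up)
mu_hi_sens, _ = precision_weighted_update(0.0, 1.0, 1.0, 9.0)   # -> 0.9
# High prior precision: perception is biased towards the internal model
mu_hi_prior, _ = precision_weighted_update(0.0, 9.0, 1.0, 1.0)  # -> 0.1
```

The same evidence (x = 1) moves the belief by very different amounts depending only on the relative precisions, which is the adjustment described above.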

Statistical representation

Bayesian statistics focus on subjective probability, taking a priori predictions (priors) into account. Calculations are grounded in the perspective of the subject: what is assessed is the subject's uncertainty, that is, the probability distribution of a parameter.
Mathematically, the Bayes theorem is represented as follows:

p(θ|D) = p(D|θ) · p(θ) / p(D)

where θ is the parameter (e.g., the proportion of heads when flipping a coin), and D is the observed sample (the evidence).
That is, the probability distribution of the parameter given the sample is proportional to the probability of observing the sample given the parameter, times the probability of the parameter. The numerator is divided by the probability of observing the sample independently of the parameter: this is a scaling factor, which ensures that the probability distribution of the parameter given the sample sums to 1.
Terminology: p(θ|D) is the posterior, p(D|θ) is the likelihood, p(θ) is the prior, and p(D) is the evidence (or marginal likelihood).
In this case, the prior distribution is represented as a histogram over the candidate proportions (see the "Initial prior distribution" figure generated by the code below).
The likelihood distribution encodes the precision granted to the prior or to the evidence when updating the prior distribution into the posterior. It represents the weighting of new evidence against prior beliefs, which must be dynamically adjusted according to the context.
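A minimal numerical illustration of one such update (Python sketch; the grid of 11 candidate proportions and the fair binomial prior mirror the simulation below, while a likelihood proportional to θ stands for one observed positive feedback):

```python
import math
import numpy as np

# Candidate values of theta: the proportion of positive feedback
theta = np.linspace(0, 1, 11)

# Fair prior: binomial pmf over 10 trials with p = 0.5
prior = np.array([math.comb(10, k) * 0.5 ** 10 for k in range(11)])

# Likelihood of one positive feedback under each candidate proportion
likelihood = theta

# Posterior: prior times likelihood, divided by the evidence p(D)
# (the scaling factor that makes the distribution sum to 1)
posterior = prior * likelihood
posterior /= posterior.sum()

# The posterior mean shifts above the prior mean of 0.5
posterior_mean = posterior @ theta   # 0.55
```

A single positive observation moves the expected proportion of positive feedback from 0.50 to 0.55; repeated updates of this kind drive the probability of success tracked during the simulation.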

Probabilistic modelling of user's cognition (prior, likelihood, and posterior inferences) during online simulation

A real-time electroencephalography (EEG) acquisition is simulated while the user listens to one-word utterances interleaved with resting periods. The speech stimulus is a list of 6 words. A 2-second silence ("rest period") lies between consecutive words. The first word is uttered after a 4-second silent window, and 2 seconds of silence follow the last word.
A buffer-like moving window (1s) with a 50% overlap (500 ms) is used to simulate real-time acquisition. The window continuously and homogeneously retrieves EEG and speech data sample by sample.
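The buffering logic can be sketched as follows (Python illustration of the mechanism implemented in the MATLAB loop below: the buffer shifts by one sample per iteration, and a full window is processed every half window once one second of data has been acquired):

```python
import numpy as np

def sliding_windows(signal, fs, win_s=1.0, overlap=0.5):
    """Yield (time_in_seconds, window) pairs from a sample-by-sample stream:
    a win_s-second buffer is emitted every win_s * (1 - overlap) seconds."""
    win = int(win_s * fs)               # window size in samples
    step = int(win * (1 - overlap))     # hop size (50% overlap -> half window)
    buffer = np.zeros(win)
    for t, sample in enumerate(signal, start=1):
        buffer[:-1] = buffer[1:]        # shift the buffer by one sample
        buffer[-1] = sample             # append the newest sample
        if t >= win and t % step == 0:  # full buffer, every half window
            yield t / fs, buffer.copy()
```

For example, with fs = 10 Hz and a 30-sample signal, windows are emitted at t = 1.0, 1.5, 2.0, 2.5, and 3.0 s, each holding the most recent second of data.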
The synchronization between theta phase and gamma amplitude of neural oscillators is assessed both during rest and speech processing. For adequate linguistic comprehension, higher mutual information (MI) is expected during speech processing.
Here is modeled a learning environment where the user must train to increase theta-gamma phase-amplitude coupling when listening to speech.
The system automatically detects the start and the end of the environmental stimulus (the utterance) in real time.
When the first word appears, the MI is computed and a positive feedback (a positively connoted image) is displayed whenever the MI is higher than the one calculated during rest. Otherwise, a negative feedback (a negatively connoted image) appears.
Before the first sample of evidence, the prior expectation of positive or negative feedback is modeled as fair (refer to the previous section and the example of coin tossing). The likelihood distribution is computed according to the evidence, considering θ as the proportion of positive feedback. The posterior distribution, which represents the modelling of the user's inferences, is then computed, and the probability of success is updated according to the evidence.
When the following words appear in the user's environment, a positive image is intended to be fed back whenever the MI is higher than during rest and higher than during preceding words. Otherwise, a negative image is intended to be fed back. This condition was established to simulate the learning requirement expressed by the improvement in MI. When this neural condition is met, the probability of positive feedback increases. However, a positive feedback is not necessarily displayed with a probability of 1, because it depends on the user's previous learning (i.e., the posterior inference of the probability of success). The higher the past improvement, the higher the probability of displaying a positive feedback.
Therefore, when a new utterance appears in the user's environment, 4 outputs are possible:
1. The neural condition of learning is met and a positive image is fed back to the user;
2. The neural condition of learning is met and a negative image is fed back to the user;
3. The neural condition of learning is not met and a negative image is fed back to the user;
4. The neural condition of learning is not met and a positive image is fed back to the user.
Whenever "the neural condition of learning is met and a negative image is fed back to the user" or "the neural condition of learning is not met and a positive image is fed back to the user", the precision (i.e., weight) given to the evidence relative to priors is adjusted to the context. That is, internal cognitive processes are considered beyond the sensory evidence, and the precision assigned to the sensory evidence (the feedback) is decreased. In particular, the likelihood distribution is modeled to follow a square-root curve, in which the probability of the evidence remains below 1 and decreases non-linearly (slower than linearly) towards higher proportions of the adequate feedback; the corresponding histogram is generated by the code below.
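The shape of this down-weighted likelihood can be checked numerically (Python sketch; the 11-point grid of proportions mirrors the simulation below):

```python
import numpy as np

# Grid of candidate proportions of the adequate feedback
theta = np.linspace(0, 1, 11)

# Linear likelihood: full precision granted to the sensory evidence
linear_likelihood = theta

# Square-root likelihood: down-weighted evidence
sqrt_likelihood = np.sqrt(theta)
# Rescale so that both vectors share the same total mass, as in the code below
sqrt_likelihood *= theta.sum() / sqrt_likelihood.sum()
```

After rescaling, the square-root likelihood never reaches 1 at the highest proportion, and its decrease towards low proportions is slower than linear (e.g., at θ = 0.5 it sits above the linear curve), which is exactly the behavior described above.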
Theta phase and gamma amplitude information have been beforehand computed within a Simulink modelling of neural oscillators' activity during speech processing. Please refer to Chapter 9 of the PhD thesis titled "Improved emotion recognition in speech by autistics: design, validation, and implementation of acoustically modulated prosodies" by Mathilde Marie Duville, available at: https://drive.google.com/file/d/1WdKHWyGWM0Su7g7ZjiQ8RZMPWeE7tldg/view
Note: This code is applied to a form of brain-computer interaction that does not require any explicit input from the user, making it neuroadaptive. Besides, the implicit interaction with the type of image feedback is ensured by autonomously adapting the probability of appearance of positive or negative feedback, based on the modelling of the user's inferences and on EEG data. Thus, it is an example of automated adaptation, where the system learns the pattern of the user's cognition over time and adapts its response to it.
Initiate variables
clear;clc; close all;
 
% Theta phase and gamma amplitude information
load('fs.mat'); load('GammaAmplitude.mat'); load('ThetaPhase.mat')
 
% Speech information
filename = '6words_24bit.wav'; [StimulusSpeech, fsSpeech] = audioread(filename);
L = length(StimulusSpeech); timevct = linspace(0,L/fsSpeech,L); time = L/fsSpeech;
 
% Play the speech audio
% player=audioplayer(StimulusSpeech,fsSpeech); play(player);
Configure parameters for online simulation
% Parameters for online simulation
window_size = fs; % Window size of buffer-like moving window (1 second)
timevctwind = linspace(0,window_size/fs,window_size);
overlap = window_size / 2; % 50% overlap
duration = time; % Duration of the simulation (seconds)
 
% Initialize buffers
buffer_size = window_size * 2; % Buffer size to hold two windows worth of data
eeg_bufferGamma = zeros(buffer_size, 1); eeg_bufferTheta = zeros(buffer_size, 1);
eeg_bufferSpeech = zeros(buffer_size, 1);
 
% Initiate variables
mi_corrected_Word = cell(1,1); %store MI values for all time windows of interest
mi_corrected_Rest = cell(1,1);
idx_word = []; idx_rest = []; t_currentWindowTot = [];
pos_ok = []; pos_NotOk = []; neg_NotOk = []; ok = []; NotOk = []; neg_ok = [];
image_tot_pos = []; image_tot_neg = [];
 
% 0 = positive image, 1 = negative image
Feedback = [0, 1];
filenamesFeedback{1,1} = 'positive.jpg'; filenamesFeedback{1,2} = 'negative.jpg';
 
%prior probability of each image being output. Start with a fair probability
p0 = 0.5; p1=1-p0;
Initiate prior distribution
% binomial prior distribution
figure('WindowState','maximized')
x0BinoPrior = 0:1:10; %trials
xBinoPrior = 0:0.1:1; binoPrior = binopdf(x0BinoPrior,max(x0BinoPrior),p0);
bar(xBinoPrior,binoPrior, 'FaceAlpha',0.3)
xlabel('Proportion of positive image')
ylabel('Probability of image appearance')
title('Initial prior distribution')
 
% Save cumulated p0
p0_cum = zeros(1, L);
p0_cum(1,1) = p0;
Initiate visual representation of real-time update of probabilities of success
% plot real-time update of probabilities of success (p(positive image))
% The probability of success should be increasing over time with learning (with the improvement MI during word processing)
figure('WindowState','maximized')
h = plot(0,0);
xlim([-1 L/fs]); ylim([-0.1 1])
xlabel('Time (s)'); ylabel('Probability of positive feedback');
title('Real-time update of probability of success (p(positive image))')
Start online simulation
for t = 1:L % During all speech stimulus
% Simulate incoming data (new sample)
new_sampleGamma = GammaAmpl(t); new_sampleTheta = ThetaPhase(t);
new_sampleSpeech = StimulusSpeech(t);
 
% Shift buffer and add new sample
eeg_bufferGamma(1:end-1, :) = eeg_bufferGamma(2:end, :);
eeg_bufferGamma(end, :) = new_sampleGamma;
 
eeg_bufferTheta(1:end-1, :) = eeg_bufferTheta(2:end, :);
eeg_bufferTheta(end, :) = new_sampleTheta;
 
eeg_bufferSpeech(1:end-1, :) = eeg_bufferSpeech(2:end, :);
eeg_bufferSpeech(end, :) = new_sampleSpeech;
 
% Automatic detection of word appearance in user's environment
if new_sampleSpeech ~= 0
idx = t; % a word appears
idx_word=[idx_word;idx];
else
idx = t;
idx_rest=[idx_rest;idx];
end
 
% Process data when we have enough samples for buffer-like moving one window
if mod(t, overlap) == 0 && t >= window_size
 
%every half window and starts when 1s acquired
% t_currentWindowTot = [t_currentWindowTot; t];
disp(['Buffer-like moving window at: ', num2str(t/fs), ' seconds'])
% Get current window of data
current_windowGamma = eeg_bufferGamma(end-window_size+1:end, :);
current_windowTheta = eeg_bufferTheta(end-window_size+1:end, :);
% current_windowStim = eeg_bufferSpeech(end-window_size+1:end, :);
 
% Identify samples that correspond to rest or speech processing in
% the current window
t_Window=t-window_size+1:t;
idx_word_currentWindow = find(ismember(t_Window, idx_word)); %at least part of a word is present in the current window
current_windowGammaWord = current_windowGamma(idx_word_currentWindow);
current_windowThetaWord = current_windowTheta(idx_word_currentWindow);
% current_windowStimWord = current_windowStim(idx_word_currentWindow);
 
idx_rest_currentWindow = find(ismember(t_Window, idx_rest));
current_windowGammaRest = current_windowGamma(idx_rest_currentWindow);
current_windowThetaRest = current_windowTheta(idx_rest_currentWindow);
 
% Compute MI during word processing
try [mi_Word, entropy, fd_bins_Word] = MI(current_windowGammaWord,...
current_windowThetaWord,[]);
% Having little amount of data may inflate MI.
% Compute inflation factor and subtract it from MI
n_Word = size(current_windowGammaWord,1);
mutinfo_error_Word = (fd_bins_Word-1).^2./(2.*n_Word.*log(2));
mi_corrected_Word{1,t} = mi_Word-mutinfo_error_Word;
mi_corrected_Word = mi_corrected_Word(~cellfun('isempty',mi_corrected_Word));
catch
mi_Word = NaN; entropy= NaN; fd_bins_Word = NaN;
end
 
% Compute MI during rest
try [mi_Rest, entropy, fd_bins_Rest] = MI(current_windowGammaRest,...
current_windowThetaRest,[]);
% Having little amount of data may inflate MI.
% Compute inflation factor and subtract it from MI
n_Rest = size(current_windowGammaRest,1);
mutinfo_error_Rest = (fd_bins_Rest-1).^2./(2.*n_Rest.*log(2));
mi_corrected_Rest{1,t} = mi_Rest-mutinfo_error_Rest;
mi_corrected_Rest = mi_corrected_Rest(~cellfun('isempty',mi_corrected_Rest));
catch
mi_Rest = NaN; entropy= NaN; fd_bins_Rest = NaN;
end
if ~isempty(current_windowGammaWord) && ~isempty(mi_corrected_Rest)
idx_currentWordWindow = size(mi_corrected_Word,2);
mi_rest = cell2mat(mi_corrected_Rest);
 
switch idx_currentWordWindow
case 1
% First window with a word. Output positive image with probability 1 if
% MI word > MI rest
if mi_corrected_Word{1,idx_currentWordWindow} > mean(mi_rest)
p0a = 1; p1a = 0;
feature = 'ok'; % conditions are met for a positive image
else
p0a = 0; p1a = 1;
feature = 'not ok'; % conditions are met for a negative image
end
Prob = [p0a, p1a];
 
cp = [0, cumsum(Prob)]; %cumulative sum of probabilities -
% if an image has a higher probability, a uniform random number is more
% likely to fall below its cumulative bound; the last index where r > cp
% identifies the drawn image (the offset accounts for the leading 0)
 
r = rand; %random number between 0 and 1
ind = find(r>cp, 1, 'last'); image = Feedback(ind);
filename = filenamesFeedback{1,image+1};
imageDisp = imread(filename);
imageFeed = figure(); imshow(imageDisp);
disp(['Feedback: ', filename]);
 
switch image
case 0
% likelihood
BinoLikelihood0 = xBinoPrior;
param = BinoLikelihood0;
case 1
% likelihood
BinoLikelihood1 = xBinoPrior(end:-1:1);
param = BinoLikelihood1;
end
 
% Output likelihood distribution
figure('WindowState','maximized')
bar(xBinoPrior,param,'FaceAlpha',0.3)
xlabel('Proportion of positive image'); ylabel('Probability of image appearance')
 
% Posterior
binoPosterior = (binoPrior.*param)/sum(binoPrior.*param); %use the selected likelihood; normalize so the posterior sums to 1
 
% Save cumulated posterior distributions
binoPosterior_cum{1,t} = binoPosterior;
binoPosterior_cum = binoPosterior_cum(~cellfun('isempty',binoPosterior_cum));
idx_currentbinoPosterior = size(binoPosterior_cum,2);
 
% Output posterior distribution
figure('WindowState','maximized')
bar(xBinoPrior,binoPrior,'FaceAlpha',0.3)
hold on
bar(xBinoPrior,binoPosterior,'FaceAlpha',0.3)
legend('Prior', 'Posterior')
xlabel('Proportion of positive image');ylabel('Probability of image appearance');
 
% Compute new probability of success
 
% Number of trials (from the length of pdf_values - 1)
n = length(binoPosterior) - 1;
% Objective function to minimize
objective_function = @(p) sum((binoPosterior - binopdf(0:n, n, p)).^2);
% Initial guess for p
p_initial = 0.5;
% Use fminsearch to find the optimal p
% Update p0
[p0,fval,exitflag,output] = fminsearch(objective_function, p_initial);
% Display the result
disp(['Posterior probability of success (p): ', num2str(p0)]);
% Update p1
p1 = 1-p0;
 
% Save cumulated p0
p0_cum(1,t) = p0;
 
% Update plot
set(h, 'XData', (1:t)/fs, 'YData', p0_cum(1:t));
drawnow;
otherwise %Not first word window
close(imageFeed); %remove previous feedback
 
mi_corrected_WorddB = cell2mat(mi_corrected_Word);
 
% if MI greater than rest AND greater than the average
% past MI during word. There was an improvement in MI
if mi_corrected_Word{1,idx_currentWordWindow} > mean(mi_rest) && ...
mi_corrected_Word{1,idx_currentWordWindow} >...
mean(mi_corrected_WorddB(1:idx_currentWordWindow-1))
 
% increases the probability of positive feedback according to EEG data
% probability of positive feedback = p0 (posterior modelling of the user)
% if conditions of MI are met --> increases it
 
p0a = p0+((1/4)*p1); % ensures that it will never be more than 1
p1a = 1-p0a;
feature = 'ok'; % conditions are met for a positive image
else %there was no improvement in MI
 
% increases the probability of negative feedback according to EEG data
% probability of negative feedback = p1 (posterior modelling of the user)
 
p1a = p1+((1/4)*p0); % ensures that it will never be more than 1
p0a = 1-p1a;
feature = 'not ok'; % conditions are met for a negative image
end
 
Prob = [p0a, p1a]; cp = [0, cumsum(Prob)];
r = rand; %random number between 0 and 1
ind = find(r>cp, 1, 'last'); image = Feedback(ind);
filename = filenamesFeedback{1,image+1};
imageDisp = imread(filename);
imageFeed = figure(); imshow(imageDisp);
disp(['Feedback: ', filename]);
 
switch image
case 0
switch feature
case 'ok'
disp('Improvement in MI: Yes');
% likelihood
BinoLikelihood0 = xBinoPrior;
 
case 'not ok' % Goal: do not affect the probability of success that much
disp('Improvement in MI: No');
% Considers the possibility of cognitive internal factors
% for likelihood (beyond sensory sampling)
% with lower weight
 
% Take the square root of the proportion vector:
% highest weight on the proportion of positive images, but never reaching 1,
% with a non-linear (slower than linear) decrease towards the proportion of negative images
BinoPrior_sqrt = sqrt(xBinoPrior);
% Rescale so that the square-rooted vector sums to the same total as the original
scaling_factor = sum(xBinoPrior) / sum(BinoPrior_sqrt);
xBinoPrior_1_ok = BinoPrior_sqrt * scaling_factor;
 
% likelihood
BinoLikelihood0 = xBinoPrior_1_ok;
end
% Highlight the distribution that needs to be multiplied by the priors
% to compute posterior
param = BinoLikelihood0;
 
case 1
switch feature
case 'ok' % Goal: do not affect the probability of success that much
disp('Improvement in MI: Yes');
% Considers the possibility of cognitive internal factors
% for likelihood (beyond sensory sampling)
% with lower weight
 
% Take the square root of the proportion vector:
% highest weight on the proportion of negative images, but never reaching 1,
% with a non-linear (slower than linear) decrease towards the proportion of positive images
BinoPrior_sqrt = sqrt(xBinoPrior);
% Rescale so that the square-rooted vector sums to the same total as the original
scaling_factor = sum(xBinoPrior) / sum(BinoPrior_sqrt);
xBinoPrior_1_ok = BinoPrior_sqrt * scaling_factor;
 
% likelihood
BinoLikelihood1 = xBinoPrior_1_ok(end:-1:1);
case 'not ok'
disp('Improvement in MI: No');
% likelihood
BinoLikelihood1 = xBinoPrior(end:-1:1);
end
% Highlight the distribution that needs to be multiplied by the priors
% to compute posterior
param = BinoLikelihood1;
end
 
% Output likelihood distribution
figure('WindowState','maximized')
bar(xBinoPrior,param,'FaceAlpha',0.3)
xlabel('Proportion of positive image'); ylabel('Probability of image appearance')
title('Likelihood')
 
% Posterior
Posterior = binoPosterior.*param;
% Scaling factor so that it sums to 1
scaling_factor = 1 / sum(Posterior);
% Scale the posterior by the scaling factor
binoPosterior = Posterior * scaling_factor;
 
% Save cumulated posterior distributions
binoPosterior_cum{1,t} = binoPosterior;
binoPosterior_cum = binoPosterior_cum(~cellfun('isempty',binoPosterior_cum));
idx_currentbinoPosterior = size(binoPosterior_cum,2);
 
% Output posterior distribution
figure('WindowState','maximized')
bar(xBinoPrior,binoPosterior_cum{1,idx_currentbinoPosterior-1},'FaceAlpha',0.3)
hold on
bar(xBinoPrior,binoPosterior,'FaceAlpha',0.3)
legend('Prior', 'Posterior')
xlabel('Proportion of positive image');ylabel('Probability of image appearance');
 
% Compute new probability of success
 
% Number of trials (from the length of pdf_values - 1)
n = length(binoPosterior) - 1;
% Objective function to minimize
objective_function = @(p) sum((binoPosterior - binopdf(0:n, n, p)).^2);
% Initial guess for p
p_initial = 0.5;
% Use fminsearch to find the optimal p
% Update p0
[p0,fval,exitflag,output] = fminsearch(objective_function, p_initial);
% Display the result
disp(['Posterior probability of success (p): ', num2str(p0)]);
% Update p1
p1 = 1-p0;
 
% Save cumulated p0
p0_cum(1,t) = p0;
 
% Update plot
set(h, 'XData', (1:t)/fs, 'YData', p0_cum(1:t));
drawnow;
end
 
end
 
end
 
end
Buffer-like moving window at: 1 seconds
Buffer-like moving window at: 1.5 seconds
Buffer-like moving window at: 2 seconds
Buffer-like moving window at: 2.5 seconds
Buffer-like moving window at: 3 seconds
Buffer-like moving window at: 3.5 seconds
Buffer-like moving window at: 4 seconds
Buffer-like moving window at: 4.5 seconds
Feedback: positive.jpg
Posterior probability of success (p): 0.54775
Buffer-like moving window at: 5 seconds
Feedback: negative.jpg
Improvement in MI: Yes
Posterior probability of success (p): 0.52246
Buffer-like moving window at: 5.5 seconds
Feedback: positive.jpg
Improvement in MI: No
Posterior probability of success (p): 0.54297
Buffer-like moving window at: 6 seconds
Buffer-like moving window at: 6.5 seconds
Buffer-like moving window at: 7 seconds
Feedback: negative.jpg
Improvement in MI: No
Posterior probability of success (p): 0.5
Buffer-like moving window at: 7.5 seconds
Feedback: negative.jpg
Improvement in MI: No
Posterior probability of success (p): 0.46396
Buffer-like moving window at: 8 seconds
Feedback: positive.jpg
Improvement in MI: No
Posterior probability of success (p): 0.48271
Buffer-like moving window at: 8.5 seconds
Buffer-like moving window at: 9 seconds
Buffer-like moving window at: 9.5 seconds
Feedback: positive.jpg
Improvement in MI: No
Posterior probability of success (p): 0.5
Buffer-like moving window at: 10 seconds
Feedback: positive.jpg
Improvement in MI: Yes
Posterior probability of success (p): 0.53115
Buffer-like moving window at: 10.5 seconds
Feedback: negative.jpg
Improvement in MI: Yes
Posterior probability of success (p): 0.51504
Buffer-like moving window at: 11 seconds
Feedback: negative.jpg
Improvement in MI: No
Posterior probability of success (p): 0.48594
Buffer-like moving window at: 11.5 seconds
Buffer-like moving window at: 12 seconds
Buffer-like moving window at: 12.5 seconds
Feedback: positive.jpg
Improvement in MI: Yes
Posterior probability of success (p): 0.51328
Buffer-like moving window at: 13 seconds
Feedback: negative.jpg
Improvement in MI: Yes
Posterior probability of success (p): 0.5
Buffer-like moving window at: 13.5 seconds
Feedback: negative.jpg
Improvement in MI: Yes
Posterior probability of success (p): 0.4874
Buffer-like moving window at: 14 seconds
Feedback: positive.jpg
Improvement in MI: Yes
Posterior probability of success (p): 0.51191
Buffer-like moving window at: 14.5 seconds
Buffer-like moving window at: 15 seconds
Buffer-like moving window at: 15.5 seconds
Feedback: negative.jpg
Improvement in MI: No
Posterior probability of success (p): 0.48867
Buffer-like moving window at: 16 seconds
Feedback: positive.jpg
Improvement in MI: No
Posterior probability of success (p): 0.5
Buffer-like moving window at: 16.5 seconds
Feedback: positive.jpg
Improvement in MI: Yes
Posterior probability of success (p): 0.52119
Buffer-like moving window at: 17 seconds
Buffer-like moving window at: 17.5 seconds
Buffer-like moving window at: 18 seconds
Feedback: negative.jpg
Improvement in MI: Yes
Posterior probability of success (p): 0.51035
Buffer-like moving window at: 18.5 seconds
Feedback: negative.jpg
Improvement in MI: Yes
Posterior probability of success (p): 0.5
Buffer-like moving window at: 19 seconds
Feedback: negative.jpg
Improvement in MI: Yes
Posterior probability of success (p): 0.49014
Buffer-like moving window at: 19.5 seconds
Buffer-like moving window at: 20 seconds
MI through time for graphical visualization
% MI over time
% Sliding 400 ms windows, step size 100 ms
slide_wnd = 400;% in ms
slide_wnd_idx = round(slide_wnd/(1000/fs)/2); %number of samples for 400 ms (200 ms before time of interest and 200 ms after)
Start = slide_wnd_idx/fs; End = (length(GammaAmpl)-slide_wnd_idx)/fs;
 
t_interest = Start:0.1:End; t_interest_idx = zeros(size(t_interest));
for k=1:length(t_interest)
% index of minimum value of difference between time vector and times of
% interest
[min_value,t_interest_idx(k)]=min(abs(timevct-t_interest(k)));
end
 
entropyTot = zeros(3,length(t_interest)); miTot = zeros(2,length(t_interest));
nbinsTot = zeros(1,length(t_interest));
 
for i = 1:1: length(t_interest)
datax = []; datax = GammaAmpl(t_interest_idx(i)-slide_wnd_idx:t_interest_idx(i)+slide_wnd_idx,:);
datay = []; datay = ThetaPhase(t_interest_idx(i)-slide_wnd_idx:t_interest_idx(i)+slide_wnd_idx,:);
[miTot(1,i),entropyTot(:,i), nbinsTot(1,i)] = MI(datax,datay,[]);
end
 
% Define the number of bins as the mean of number of bins across time windows
% Keep the number of bins constant across time windows as it can strongly
% impact the value of MI
n_bins = ceil(mean(nbinsTot)); entropyTot = zeros(3,length(t_interest));
 
for i = 1:1:length(t_interest)
datax = []; datax = GammaAmpl(t_interest_idx(i)-slide_wnd_idx:t_interest_idx(i)+slide_wnd_idx,:);
datay = []; datay = ThetaPhase(t_interest_idx(i)-slide_wnd_idx:t_interest_idx(i)+slide_wnd_idx,:);
[miTot(1,i),entropyTot(:,i), nbinsTot(1,i)] = MI(datax,datay,n_bins);
 
% Having little amount of data may inflate MI.
% Compute inflation factor and subtract it from MI
n = size(datax,1); mutinfo_error = [];
mutinfo_error = (n_bins-1).^2./(2.*n.*log(2));
miTot(2,i) = miTot(1,i)-mutinfo_error;
end
Find indexes of first and last samples of each word
idx_max_tot = []; idx_min_tot = idx_word(1);
for i = 2:length(idx_word)-5
%if the next index is more than 5 samples ahead, the current sample is the last one of a word
%(5 samples chosen for safety; the rest period between words is 2 seconds = 96000 samples)
if idx_word(i+1)>(idx_word(i)+5)
idx_max_tot = [idx_max_tot,idx_word(i)];
end
 
%if the index is more than 5 samples after the previous one, a new word starts
%(5 samples chosen for safety; the rest period between words is 2 seconds = 96000 samples)
if idx_word(i)>(idx_word(i-1)+5)
idx_min_tot = [idx_min_tot,idx_word(i)];
end
end
 
idx_max_tot(end+1) = idx_word(end);
Plot graph
figure('WindowState','maximized');
L = length(miTot(2,:));
timeWindow = linspace(Start-(slide_wnd_idx/fs), End+(slide_wnd_idx/fs),L);
plot(timeWindow,miTot(2,:), 'LineWidth', 1);
for i = 1:1:size(idx_max_tot,2)
xline((idx_min_tot(i)/fs), ':r', 'LineWidth', 3);
xline((idx_max_tot(i)/fs), ':r', 'LineWidth', 3);
end
xlim([1 20.5])
legend('Mutual information (MI)', 'Time windows of word processing', ...
'position', [0.630373261082504 0.849893819614869 0.150911461139718 0.0492710593452206])
xlabel('Time (s)'); ylabel('MI (bits)');
title(sprintf(['MI between Theta Phase/Gamma Amplitude over time',...
'\n Brain-Brain Coupling']));

Inferences from data

We can see from the last graph that, although MI was higher during speech processing than during rest, MI did not improve over time during word processing.
This observation is in line with the lack of improvement of probability of success (probability of positive feedback) over time assessed during the online simulation (see plot h titled "Real-time update of probability of success (p(positive image))" updated online during simulation).
In the context of a learning environment, we could infer that learning was not promoted by the context. A new user environment that better favors learning should be envisioned.